Goto

Collaborating Authors

 task-specific parameter






MetaCaDI: A Meta-Learning Framework for Scalable Causal Discovery with Unknown Interventions

Ong, Hans Jarett, Chikahara, Yoichi, Iwata, Tomoharu

arXiv.org Machine Learning

Uncovering the underlying causal mechanisms of complex real-world systems remains a significant challenge, as these systems often entail high data collection costs and involve unknown interventions. We introduce MetaCaDI, the first framework to cast the joint discovery of a causal graph and unknown interventions as a meta-learning problem. MetaCaDI is a Bayesian framework that learns a shared causal graph structure across multiple experiments and is optimized to rapidly adapt to new, few-shot intervention target prediction tasks. A key innovation is our model's analytical adaptation, which uses a closed-form solution to bypass expensive and potentially unstable gradient-based bilevel optimization. Extensive experiments on synthetic and complex gene expression data demonstrate that MetaCaDI significantly outperforms state-of-the-art methods. It excels at both causal graph recovery and identifying intervention targets from as few as 10 data instances, proving its robustness in data-scarce scenarios.





Amortized Bayesian Meta-Learning for Low-Rank Adaptation of Large Language Models

Zhang, Liyi, Snell, Jake, Griffiths, Thomas L.

arXiv.org Machine Learning

Fine-tuning large language models (LLMs) with low-rank adaptaion (LoRA) is a cost-effective way to incorporate information from a specific dataset. However, it is often unclear how well the fine-tuned LLM will generalize, i.e., how well it will perform on unseen datasets. Methods have been proposed to improve generalization by optimizing with in-context prompts, or by using meta-learning to fine-tune LLMs. However, these methods are expensive in memory and computation, requiring either long-context prompts or saving copies of parameters and using second-order gradient updates. To address these challenges, we propose Amortized Bayesian Meta-Learning for LoRA (ABMLL). This method builds on amortized Bayesian meta-learning for smaller models, adapting this approach to LLMs while maintaining its computational efficiency. We reframe task-specific and global parameters in the context of LoRA and use a set of new hyperparameters to balance reconstruction accuracy and the fidelity of task-specific parameters to the global ones. ABMLL provides effective generalization and scales to large models such as Llama3-8B. Furthermore, as a result of using a Bayesian framework, ABMLL provides improved uncertainty quantification. We test ABMLL on Unified-QA and CrossFit datasets and find that it outperforms existing methods on these benchmarks in terms of both accuracy and expected calibration error.